Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for packetfilter API and iptables driver #2835

Merged
merged 6 commits into from
Jan 25, 2024

Conversation

yboaron
Copy link
Contributor

@yboaron yboaron commented Dec 12, 2023

To allow smooth transition to nftables a new packetfilter component is introduced, packetfilter
provides an API to create chains, and rules in a generic way and should
include pluggable drivers for iptables and nftables.

Packetfilter also includes a static function that defines the driver to use.

This PR includes:

  1. Refactoring the current iptables code to be used as pluggable driver.
  2. Fake driver implementation for unit tests.
  3. Update clients to use packetfilter APIs

Generic named set API will be pushed in a separate PR.

check [1] for nftables support high level design .

Co-authored-by: Tom Pantelis [email protected]

[1]
https://docs.google.com/document/d/1PAjU61XUGaQ2qZZu_66clxadC997lsBGYcjydYEayR0/edit?usp=sharing

@submariner-bot
Copy link
Contributor

🤖 Created branch: z_pr2835/yboaron/packetfilter
🚀 Full E2E won't run until the "ready-to-test" label is applied. I will add it automatically once the PR has 2 approvals, or you can add it manually.

@yboaron yboaron force-pushed the packetfilter branch 3 times, most recently from 45c33e3 to 29c8246 Compare December 13, 2023 13:18
@tpantelis
Copy link
Contributor

Since the packetfilter package is intended to replace iptables, we had also discussed renaming/moving directories/files to maintain git history continuity. So first rename the iptables dir to packetfilter then move iptables.go under packetfilter/iptables as a starting point.

@yboaron yboaron force-pushed the packetfilter branch 8 times, most recently from aa36ab1 to 9712f16 Compare January 1, 2024 18:00
@yboaron yboaron added the ready-to-test When a PR is ready for full E2E testing label Jan 1, 2024
@yboaron yboaron force-pushed the packetfilter branch 5 times, most recently from b935856 to c2f5e8c Compare January 7, 2024 10:59
@@ -95,8 +96,8 @@ func (w *egressPodWatcher) onCreateOrUpdate(obj runtime.Object, _ int) bool {

logger.V(log.DEBUG).Infof("Pod %q with IP %s created/updated", key, pod.Status.PodIP)

if err := w.namedIPSet.AddEntry(pod.Status.PodIP, true); err != nil {
logger.Errorf(err, "Error adding pod IP %q to IP set %q", pod.Status.PodIP, w.ipSetName)
if err := w.pfIface.AddEgressPodIP(w.gnHandle, pod.Status.PodIP); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose we have a use-case where we want all the pods from a certain namespace to use a specific egressIP, we create a namedIPSet which has a unique name created using the key (i.e., namespace/name) and and store the PodIPs that are part of that namespace. It is not clear if we are doing the same in the new code. Can you explain how we are handling it? Is gnHandle supposed to be a unique name derived based on the key?

@tpantelis
Copy link
Contributor

I pushed #2871 based on my comments to serve as a starting point for the general structure. I left the packetfilter interface the same as it was. I think the logical next step would be to generalize the interface as is done in this PR. Then I think it would be useful to implement the interface using nftables and see if any changes are needed. Eg, can InsertUnique, PrependUnique, UpdateChainRules etc be implemented in the adapter interface as they are now?

@sridhargaddam
Copy link
Member

I guess we need also the CNI (KindNet) switched to nftables , for full e2e testing on Kind, right ?

Yes. Kindnet programs certain IPtable rules mainly to perform MASQ for outbound traffic so that the pods can reach internet. As you are familiar, iptables-nft was created to facilitiate smooth migration to nftables and pretty much all the CNIs (along with kubeproxy) use iptables-nft under the hood today.
So, if the nftables handler in Submariner uses nft while the CNI uses iptables-nft AFAICT it should still continue to work, assuming we take care of the priority to enforce the order in which the rules have to be applied.

@yboaron yboaron force-pushed the packetfilter branch 2 times, most recently from 273ee7a to 6aced5b Compare January 16, 2024 07:45
@yboaron yboaron force-pushed the packetfilter branch 2 times, most recently from ca8fb1d to 259219e Compare January 16, 2024 12:07
@yboaron
Copy link
Contributor Author

yboaron commented Jan 16, 2024

With the latest version I pushed, we have the history pointing to the prev files for the relevant files [1] ,
do we need something else ?

[1]
git log --pretty=oneline --follow pkg/packetfilter/adapter.go
git log --pretty=oneline --follow pkg/packetfilter/packetfilter.go
git log --pretty=oneline --follow pkg/packetfilter/fake/packetfilter.go
git log --pretty=oneline --follow pkg/routeagent_driver/handlers/kubeproxy/packetfilter_iface.go
git log --pretty=oneline --follow pkg/routeagent_driver/handlers/kubeproxy/kp_packetfilter.go

@yboaron
Copy link
Contributor Author

yboaron commented Jan 16, 2024

Yes. Kindnet programs certain IPtable rules mainly to perform MASQ for outbound traffic so that the pods can reach internet.

My Q was more about the nftables tables Submariner should use/create,
we know that Kube-proxy uses its own table, and a CNI that uses iptables-nft use the default tables and chains similar to iptables while other CNIs that use nftables may create their own table/s.

I guess we can't assume that default tables/chains (iptables-nft) will always be there, WDYT? meaning we should create our own table.

@tpantelis
Copy link
Contributor

With the latest version I pushed, we have the history pointing to the prev files for the relevant files [1] , do we need something else ?

That looks good.

@sridhargaddam
Copy link
Member

My Q was more about the nftables tables Submariner should use/create, we know that Kube-proxy uses its own table, and a CNI that uses iptables-nft use the default tables and chains similar to iptables while other CNIs that use nftables may create their own table/s.

I guess we can't assume that default tables/chains (iptables-nft) will always be there, WDYT? meaning we should create our own table.

AFAICT, if some component is using "iptables" (with iptables-nft under the hood) it will anyway use the default tables (like nat, filter, mangle etc) and it can create it own chains (like KUBE-PREROUTING etc) with a rule in the default chain (like PREROUTING) to forward packets to its custom chain (KUBE-PREROUTING). While using iptables they will not be creating any custom tables.

However, when a component has fully migrated to support nft, it can decide to create additional tables within each family and these tables can be given custom names. The components can then configure higher priority to those tables so that it gets priority over the default tables.

Personally, I feel in nftables implementation, we can create our own tables with Submariner specific chains/rules and use a higher priority.

@yboaron yboaron force-pushed the packetfilter branch 2 times, most recently from 46070c3 to 1e9e055 Compare January 21, 2024 17:08
Copy link
Contributor

@tpantelis tpantelis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just some min or things.

Please update the summary for 65c28b2 to reflect that we didn't add general MSS clamping and GN APIs. Also remove "1. Packetfilter component, auto-detecting hardcoded to iptables.".

Similarly please update the PR summary.

yboaron and others added 2 commits January 23, 2024 13:01
To allow a smooth transition to nftables a new packetfilter component
is introduced, packetfilter provides an API for creating chains and
rules in a generic way and should include pluggable drivers for
iptables and nftables.

Packetfilter also includes a static function that defines the driver
to use.

This commit includes:
- Refactoring the current iptables code to be used as pluggable driver.
- Fake driver implementation for unit tests.

Co-authored-by: Tom Pantelis <[email protected]>
Signed-off-by: Yossi Boaron <[email protected]>
@tpantelis tpantelis enabled auto-merge (rebase) January 25, 2024 12:14
@tpantelis tpantelis merged commit 636b457 into submariner-io:devel Jan 25, 2024
38 checks passed
@submariner-bot
Copy link
Contributor

🤖 Closed branches: [z_pr2835/yboaron/packetfilter]

@yboaron yboaron deleted the packetfilter branch February 14, 2024 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-test When a PR is ready for full E2E testing
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants